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A new Langevin- Verlet thermostat that preserves the fluctuation-dissipation relationship for dis- 
crete time steps, is applied to molecular modeling and tested against several popular suites (AMBER, 
GROMACS, LAMMPS) using a small molecule as an example that can be easily simulated by all 
three packages. Contrary to existing methods, the new thermostat exhibits no detectable changes in 
the sampling statistics as the time step is varied in the entire numerical stability range. The simple 
form of the method, which we express in the three common forms (Velocity- Explicit, Stormer- Verlet, 
and Leap- Frog), allows for easy implementation within existing molecular simulation packages to 
achieve faster and more accurate results with no cost in either computing time or programming 
complexity. 
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I. INTRODUCTION 



II. THE THREE VERLET EXPRESSIONS 



Recently, a new stochastic thermostat |l| , based on an 
exact implementation of the fluctuation-dissipation re- 
lationship in discrete time, was presented as integrated 
into the well-known and widely used Verlet formalism. 
It was analytically demonstrated that the method pro- 
vides exact thermodynamic response for both flat and 
harmonic potentials for any time step within the Verlet 
stability criteria. This unique thermodynamic feature of 
the approach, as implemented into the Verlet framework, 
makes it attractive for a wide range of applications, where 
it is desired to execute efficient and accurate simulations. 
Here, we demonstrate that the method is not limited to 
simple linear cases, but also extends to complex nonlinear 
systems. 

Langevin dynamic simulations constitute an appealing 
approach for simulations of physical systems in contact 
with a thermodynamic heat bath. A very popular class of 
such systems is molecular dynamics (MD) [2| of classical 
particle ensembles. The method is based on numerical 
integration of the Langevin equation 



(i) 



where r is the coordinate, v = r is its velocity, m is its 
mass, / is a net deterministic force acting on r. The 
friction constant a > and the noise (3 are connected by 
the fluctuation-dissipation relationship Q 
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(3) 



with ks and T being Boltzmann's constant and the ther- 
modynamic temperature, respectively. 



Verlet integrators are usually expressed in one of three 
forms |4[: A) velocity-explicit Verlet (VE), which ad- 
vances the trajectory one time step based on the co- 
ordinate and its conjugate variable (here the velocity), 
B) Stormer- Verlet (SV), which uses coordinates at two 
consecutive time steps to advance time, and C) leap- 
frog (LF), which advances the trajectory based on the 
coordinate and its conjugate variable, the latter being 
represented at half-integer time steps relative to the co- 
ordinate. The three typical Verlet formulations produce 
identical trajectories, and, thus, are different expressions 
of the exact same method. Consequently, applications 
of the Verlet method are commonly expressed in any of 
the available forms. Since all three identical methods 
are frequently used in the literature, we start here with 
the recently published G- JF thermostat that was derived 
in the natural VE form, and re-express the algorithm in 
the two other popular forms. The previous harmonic 
oscillator analysis [l[ of the method applies to all three 
variants since they produce identical results. Specifically, 
the three following formulations will result in the correct 
fluctuation-dissipation relationship and thus, the correct 
thermodynamic response in linear systems. 



A. Velocity Explicit G-JF 

We take our starting point with the VE G-JF expres- 
sions that were derived and analyzed in [l| . Denoting dis- 
crete time variables by the integer time step superscript, 
such that, e.g., r n = r(t n ), the algorithm for advancing 
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r n and v n one time step of dt reads 

r n+l = r n +bdty n + bd^ f n + bdt /3 n+l ,q 

2m 2m 

= au n + ^( a /« + /»+!) + ( 5 ) 

2m m 



where 



i adt -i 

2m ft = i 



2m 



adt 
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with the method, we replace u" 1 in Eq. (TlCfl) by inserting 
Eq. ©, such that 



V n = T 37 + i£r + 1^0 n 

b at 2m 2m 

2dtb + 4m lP P ] ' 

(13) 

Thus, Eqs. (|TTj). (|T2|I. and (HU) constitute the identical 
SV form of the VE expressions. 



and where 



t„ +I 

n+l _ / ai4-'\ j+i 



/3(f) dt' 
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is a standard Gaussian random number that satisfies 

(p n ) = , (/3"/3 z ) = 2ak B TdtS nJ . (8) 

Setting the initial conditions (r Q ,v°), Eqs. (HJ-dHJ) can 
be directly used to generate the trajectory (r n ,v n ) from 
which the dynamical and statistical information can be 
derived. 



C. Leap-Frog G-JF 

This version of the Verlet method for Langevin equa- 
tions comes with some flexibility in how the method is 
expressed. We take the starting point with the SV form 
given in Eq. (| 1 1|) and introduce a reasonable definition of 
the half-step velocity 



r n+l „n 

V n +2 = 

dt 



(14) 



We now use Eq. (JT3J) to replace r n+1 in Eq. (fTT|) to obtain 



B. Stormer- Verlet G-JF 

We here start by rewriting Eqs. ^ and ([5]) for n — > 
n - 1: 

r n = r"- 1 + bdtv"- 1 + ^-f 1 ' 1 + (9) 

2m 2m 

v n = av"- 1 + ^{af 11 - 1 + /") + —p n . (10) 
2m m 



As outlined in Ref. [1[, Eq. (1T01 can be inserted into 
Eq. ((U) in order to replace v n , whereafter Eq. © is used 
to replace the resulting This yields 
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bdt 2 .„ bdt 
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1 r "^ + ^r + 7i V + /3"+ 1 ), 

dt m 2m 
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in which we can again apply Eq. (|14[) and arrive at 



= av n -i + —f n + -^-(/3 n +j3 n+1 ), (16) 
m 2m 



This equation is the half-time step velocity propagator, 
which is complemented by Eq. (TTJJ to yield a LF G-JF 
method 



r n+l = r n +dtv n+^ 



(17) 



As was the case for the SV formulation of the method, 
this LF representation does not trivially incorporate the 
natural initial conditions (r°,v°). We therefore apply 
Eq. (fl4l for n = in combination with Eq. (|12l) , resulting 
in 



which is the SV formulation of the G-JF method. Unlike 
the VE expressions, the SV equation does not contain 
direct information about the velocity and is therefore not 
directly applicable for natural initial conditions (r°,v°). 
The self-consistent approach for starting this procedure 
from (r°, i>°) is to apply Eq. ([5]) for n = 0, 



(18) 



i . n bdt M b , 
v 2 = bv + — / + — j3 
2m 2m 



to be used, along with Eq. (Ti"2"j) . before applying Eqs. (TTJ 
and (JTTJ) for n > 0. The proper integer-step velocity can 
be found from combining Eqs. (| L0|) and Q, resulting in 



1 n , , n bdt 2 „ n b dt , , 

r 1 = r° + bdtv° + f + /3 1 , (12) 

2m 2m 

then apply Eq. ([TT]) for all subsequent time steps n > 0. 
In order to calculate both the complete dynamical trajec- 
tory and important thermodynamic quantities, one needs 
the velocity v n explicitly expressed as well. Consistent 



v n +2 = av n -? +2- 



dt 



2bv n + ((3 n+1 -f3 n ), 

2m 

(19) 



where we can then, again, use Eq. (|14l) to obtain 

v n = i-(v n+ ? +av n -i) + -^-{B n - /3 n+1 ) . (20) 
2b 4m 
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Thus, Eqs. (fl6)) . ([T7j). and (j20j) constitute a consistent 
LF method, where initial conditions (r , « ) are applied 
through Eqs. (JE) and (TT8|) . 

Notice that while the given SV method is uniquely con- 
nected to the VE expressions in both the coordinate {r n } 
and its evaluated velocity {v n }, the development of LF is 
not unique, even if LF is constructed to produce identical 
trajectories (r n , v n ) to those of VE and SV. The reason is 
the aforementioned somewhat ambiguous choice in defin- 
ing the half-step velocity w™+3 in Eq. (|T4"|). and the sub- 
sequent reconstruction of the integer-time velocity v n in 
Eq. (j2"0")) . Thus, one can exercise some freedom of choice 
in the velocity equations when using the LF expressions. 

For example, a sensible alternative to the half-step ve- 
locity in Eq. (|T4|) may be 



r.n+1 



bdt 



2m^ 



(21) 



where we use the symbol u for the revised definition of 
the half-step velocity. Following the procedure starting 
from Eq. (TH|) . we can develop the following alternate 
LF formulation, which also yields identical trajectories 
(r n ,v n ). The two equations for half-step velocity and 
integer-step position become 



m Im 



r n+l _ r n 



bdt 



,n+4 



1 

Ir 
bdt 
2m 



n+l 



(22) 
(23) 



with initiating half-step velocity and conversion to 
integer-step velocity v n written 



— f 

2m 



2 V 



(24) 
(25) 



respectively. As mentioned above, all LF formulations 
of the thermostat yield identical trajectories for (r n ,v n ) 
and therefore constitute the same method regardless of 
the specific definition of the half-step velocity. Choosing 
which one to use is entirely a matter of convenience. The 
defined half-step velocity is simply an auxiliary variable, 
which need not have any specific useful physical interpre- 
tation for the evaluation of thermodynamic quantities. 



III. TESTING THE METHOD 

We exemplify the applicability of the G-JF method in 
the context of a simple biomolecular simulation and we 
make comparison to results of the widely-used molecu- 
lar dynamics codes AMBER @, LAMMPS ||, E3, and 
GROMACS [ll|. These packages employ different com- 
monly used stochastic thermostats. More comprehensive 
discussions of available thermostats can be found in re- 
cent references [l], [l2| . The purpose of the following sim- 
ulations is not to present new or ground-breaking results 



in biomolecular science; instead, we choose a simple and 
well-understood representative model system that can 
illuminate, through simulations within well-established 
MD packages, some key features of the new algorithm 
as implemented into one of the codes that is particularly 
amenable to revisions. 



A. Simulation Details 

We performed classical molecular dynamics simula- 
tions of alanine dipeptide (illustrated in Figure [Ik.), a 
small and well-studied biomolecule. Intramolecular in- 
teractions among the solute atoms, including bond, an- 
gle, dihedral, and non-bonded energies, were modeled 
with an AMBER classical all-atom molecular mechanical 
treatment coupled with the recent ffl2 parameter set 
[y] , while the extra-molecular environment was treated as 
vacuum. 

This model system was simulated at a target tempera- 
ture of 300 K with four integration schemes for compar- 
ison: i) The BBK thermostat Q, which is implemented 
and expressed within AMBER 12 (ntt=3) in the leap- 
frog Verlet formulation Q; ii) the method of Schneider 
& Stoll as implemented in the LAMMPS simula- 
tion software [i| (with the combination of the nve and 
langevin fixes); Hi) a variation of the method of van Gun- 
steren & Berendsen [fl li~5|, im plemented in GROMACS 
[ill ] (as the sd integrator [Hj]), which includes velocity 
rescaling; and finally iv) the G-JF thermostat [l| . 

The latter was implemented into AMBER through 
small modifications to the AMBER 12 source code [f|. 
Since the underlying LF Verlet integrator used in the 
BBK method is consistent with Eqs. (TTB]), (fTT]), and pi . 
revisions were only necessary with regard to the fluctu- 
ation terms. The memory framework was modified to 
include space for the correlating noise term from the pre- 
vious time step, corresponding to f3 n in Eqs. (|16p and 
([20|) . to be cached for use in subsequent integration steps. 
A Langevin damping coefficient of 10 ps _1 was used 
throughout all the simulations. Simulation input data 
for LAMMPS and GROMACS runs were generated di- 
rectly from AMBER-formatted coordinate and topology 
files using the free tools amber 2lammps.py (distributed 
with the LAMMPS source code), and amb2gmx.pl [l7| . 
respectively. 

Data were collected from sets of ten independent 100 ps 
simulations at several values of the integration time step, 
from 0.5-3.1 fs in increments of 0.1 fs, and from 3.1-3.2 fs 
in increments of 0.01 fs. Each simulation was initiated 
from the same energy-minimized starting structure of the 
peptide. For each independent simulation, initial atomic 
velocities were assigned from a Maxwell-Boltzmann dis- 
tribution appropriate to the target temperature and the 
psuedo-random number chain was uniquely seeded. 

First and second moments of the total potential en- 
ergy distribution were chosen as measures of the sta- 
bility and accuracy with which each integration method 
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FIG. 1: (a) Space- filling model of the simulated alanine dipep- 
tide (Ac-Ala-NHMe) molecule, (b) Mean and (c) standard 
deviation of the potential energy in the model alanine dipep- 
tide, computed with different molecular dynamics codes and 
various time steps. Results are depicted as follows, top curve 
to bottom: Blue for LAMMPS; orange for AMBER; red for 
GROMACS; and green for AMBER with the G-JF implemen- 
tation. 



behavior, albeit seemingly with only one third devia- 
tion, is found for the thermostat of van Gunsteren & 
Berendsen, implemented into GROMACS. In contrast, 
the G-JF method, implemented into AMBER 12 as de- 
scribed above, shows the inherent feature of preserving 
the fluctuation-dissipation relationship for any time step. 
In fact, for these simulations, the G-JF method can give 
statistical estimates at all time steps up to the stability 
limit that are indistinguishable from those at small time 
steps. The stability limit is here identified by adiabati- 
cally increasing the time step of a simulation until sim- 
ulations produce anomalously high velocities, indicating 
that the Verlet integrator is no longer capable of pro- 
ducing a meaningful trajectory for the simulation. The 
existing stochastic thermostats seem stable up to 3.0 fs 
for this simulation, while the G-JF is stable for up to a 
slightly higher value 3.1 fs. 

We close this section by arguing for the use of potential 
energy (and its fluctuations) as a measure for the abil- 
ity of the different thermostats to correctly sample the 
phase space corresponding to the target temperature. A 
seemingly more straightforward test would be to consider 
the kinetic energy. However, as shown in Ref. [l[ for the 
harmonic oscillator case, the computed velocity v n , and 
thereby the kinetic energy, is increasingly depressed for 
increasing time step in any Verlet scheme, regardless of 
the inclusion of a thermostat. Thus, when evaluating a 
thermostat, measures of velocity and kinetic energy may 
not be appropriate for determining the quality of statis- 
tical sampling. Further, this observation may hint at sta- 
tistical sampling problems arising from thermostats that 
involve "velocity rescaling" as a way to obtain a desired 
temperature. 



could reproduce the statistical-mechanical properties of 
the physical system. Average energy and its fluctuation 
were computed based on samples at all integration steps. 
The total number of integration steps N was varied based 
on the time step dt, such that N = [100 ps/dt\. 

B. Results 

Figures [TJd and c summarize the results and shows the 
average total potential energy and its standard deviation 
as a function of the simulated time step for all the in- 
tegration methods. As expected, all methods give very 
similar results for both the average and standard devi- 
ation of the energies for as long as the time steps are 
smaller than about 1 fs. 

However, for increasing time steps the unmodified con- 
temporary codes start deviating from the expected sta- 
tistical values found for small dt. The two strongest de- 
viations are found for the BBK method, implemented 
in AMBER 12, and the thermostat of Schneider & 
Stoll, implemented into LAMMPS. The same deviating 



IV. DISCUSSION AND CONCLUSION 

The newly developed G-JF stochastic Verlet thermo- 
stat has been applied to simulations of a small, yet non- 
trivial and nonlinear, system representative of many ap- 
plications in molecular modeling. It has previously been 
analytically demonstrated that for linear systems the new 
method yields exact statistical behavior of diffusion and 
Boltzmann distributions for any time step leading to sta- 
ble dynamics [![. The simulations presented here indi- 
cate that these attractive and robust statistical features 
are likely to remain in other complex and nonlinear sys- 
tems. Specifically, our present simulations demonstrate 
that the statistical behavior of potential energy remains 
sound for any time step up to the limit where the atomic 
trajectories suddenly diverge. In contrast, available con- 
temporary molecular dynamics codes with other stochas- 
tic thermostats show deviating behavior for increasing 
time step, indicating that the interpretation of thermo- 
dynamic data from those algorithms must be done with 
caution and small time steps. This seems to be the case 
for the three popular molecular dynamics codes that we 
have investigated here, and is likely to be true also for 
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other available MD simulation codes. In order to fa- 
cilitate comparison between the stochastic thermostats, 
we use the same AMBER force fields in all four sets of 
simulations. We emphasize that simulations with the 
G-JF thermostat have been completed by implementing 
the new simple algorithm into an existing available code 
(AMBER 12), to ensure direct comparison between the 
thermostats without any other differences in parameters 
or simulation details. Thus, based on the preliminary 
tests and analyses, we suggest that the algorithm pre- 
sented here and in Ref. [l| be implemented into existing 
molecular dynamics codes for further use and evaluation. 
In order to facilitate such revisions, we have explicitly 
provided the algorithm in all three commonly used for- 



mulations of the Verlet method. 
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